Adaptive Tradeoff in Metadata-based Small File Optimizations for a Cluster File System
Abstract
Metadata-based optimizations are a common way to improve small-file performance in local file systems. However, several problems arise when similar optimizations are applied to small files in cluster file systems. In this paper, we study the tradeoffs between metadata performance and small-file performance in metadata-based optimizations for a cluster file system. Our method aims to guarantee metadata performance by adaptively migrating small files among file system nodes. We establish a theoretical model to analyze the small-file load that needs to be migrated. To compute the migrated load in advance, we devise a novel forecasting method that accurately predicts the one-step-ahead load of metadata and small files on a metadata server (MDS). We then propose an adaptive small-file threshold model to decide which small files to migrate; the model considers long-term and short-term tradeoffs separately. To reduce migration overhead, we discuss the migration tradeoffs for small files and present methods and schemes to eliminate unnecessary overhead. Finally, experiments on a cluster file system show the efficiency of our method in improving load-forecasting accuracy, trading off metadata and small-file performance, and reducing migration overhead.
Related resources
Directory-Based Metadata Optimizations for Small Files in PVFS
Modern file systems maintain extensive metadata about stored files. While this usually is useful, there are situations when the additional overhead of such a design becomes a problem in terms of performance. This is especially true for parallel and cluster file systems, because due to their design every metadata operation is even more expensive. In this paper several changes made to the paralle...
Dynamic file system semantics to enable metadata optimizations in PVFS
Modern file systems maintain extensive metadata about stored files. While metadata typically is useful, there are situations when the additional overhead of such a design becomes a problem in terms of performance. This is especially true for parallel and cluster file systems, where every metadata operation is even more expensive due to their architecture. In this paper several changes made to t...
The Composite-file File System: Decoupling the One-to-One Mapping of Files and Metadata for Better Performance
Traditional file system optimizations typically use a one-to-one mapping of logical files to their physical metadata representations. This mapping results in missed opportunities for a class of optimizations in which such coupling is removed. We have designed, implemented, and evaluated a composite-file file system, which allows many-to-one mappings of files to metadata, and we have explored the...
Dynamic Hashing: Adaptive Metadata Management for Petabyte-scale File Systems
In a petabyte-scale file system, metadata access performance and scalability will significantly affect the whole system’s performance and scalability. We present a new approach called Dynamic Hashing (DH) for metadata management. DH introduces the RELAB (RElative LoAd Balancing) strategy to adjust the metadata distribution when the workload changes dynamically. Elasticity strategy is proposed t...
Scalable Performance of the Panasas Parallel File System
The Panasas file system uses parallel and redundant access to object storage devices (OSDs), per-file RAID, distributed metadata management, consistent client caching, file locking services, and internal cluster management to provide a scalable, fault tolerant, high performance distributed file system. The clustered design of the storage system and the use of client-driven RAID provide scalable ...